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^ | Abstract 

Q , We investigate the application of Krylov space methods to the solution of shifted 

' linear systems of the form (A + a)x — b — for several values of a simultaneously, 

using only as many matrix-vector operations as the solution of a single system re- 
quires. We hnd a suitable description of the problem, allowing us to understand 
known algorithms in a common framework and developing shifted methods basing 
t^J- \ on short recurrence methods, most notably the CG and the BiCGstab solvers. The 

convergence properties of these shifted solvers are well understood and the deriva- 
tion of other shifted solvers is easily possible. The application of these methods to 
quark propagator calculations in quenched QCD using Wilson and Clover fermions 
is discussed and numerical examples in this framework are presented. With the 
0^ ' shifted CG method an optimal algorithm for staggered fermions is available. 



1 Introduction 



In many cases Krylov space solvers are the methods of choice for the inversion of large 
sparse matrices. While most Krylov space solvers are parameter free and do not have to 
be tuned to a particular problem, exploiting special algebraic properties of the matrix 
can lead to considerable acceleration of these algorithms. A recently discussed example 
is given by J-hermitean matrices, e.g. JM = J, where the number of matrix- vector 
products of algorithms like QMR or BiCG can be reduced by a factor of two if 
multiplications by J and J -1 are cheap. Another case which has been discussed in some 
detail recently is the application of Krylov space solvers to shifted equations, i.e. where 
the solution to 

(A + a)x-b = (1.1) 

has to be calculated for a whole set of values of a. This kind of problem arises in 
quark propagator calculations for QCD as well as other parts of computational physics 
(see Q). It has been realized that several algorithms allow one to perform this task 
using only as many matrix-vector operations as the solution of the most difficult single 
system requires. This has been achieved for the QMR Q, the MR Q and the Lanczos- 
implementation of the BiCG method [II. We present here a unifying discussion of the 
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principles to construct such algorithms and succeed in constructing shifted versions of 
the CG, CR, BiCG and BiCGstab algorithms, using only two additional vectors for each 
mass value. The method is also easily applicable to many other cases. The key to this 
construction is the observation that shifted polynomials, defined by 

P°{A + a) = cP n (A) (1.2) 

where P n (A) is the polynomial constructed in the Krylov space method, are often still 
useful objects. Since vectors generated by these shifted polynomials are simply scaled 
vectors of the original vectors, they are easily accessible. 

In the following sections we discuss the properties and construction of shifted poly- 
nomials in several cases. We then present the shifted versions of the above mentioned 
algorithms and finally perform some numerical tests. 

2 Shifted polynomials 

Our ultimate goal is to construct an algorithm for a whole trajectory of matrices A + 
a while only applying the matrix-vector operations for the inversion of one matrix, 
presumably the one with the slowest convergence. In the class of Krylov space solvers, 
one deals with residuals or iterates which are in some ways derived from polynomials 
P n (z) of the matrix A: 

v n = P n (A)v . (2.1) 

We simply define the shifted polynomial P%(z) as 

PZ(z + v) = <ZP n (z). (2.2) 

is determined by the normalization conditions for P n (z) required in the algorithm. 
It is easy to see that we can construct solvers which generate iterates of the form 

< = PZ(A)v (2.3) 

without additional matrix-vector products for multiple values of a since the calculation 
of 

{A + a)v a n = c a n (Av n + av n ) (2.4) 

can be derived from matrix- vector products of one single system. If P n (z) is a polynomial 
which reduces the vector vq, e.g. which is an approximation to in some complex region 
containing the relevant eigenvalues of A and c a n < 1, P£(z) will be a useful polynomial, 
too. 

Another class of useful polynomials are the Leja-polynomials, where the roots of 
the polynomial are given by the Leja points of a compact set K in the complex plane 
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not containing the origin implicitely defined by 

\z\ \ = sup [jar I (2-5) 
zeK 

i-i i-i 

J J \zj - :i. sii]) J J : :/. . zj G K, j = 2,---,n. (2.6) 

fc=l ^ eK fe=i 

The Leja points are usually not uniquely defined. The polynomial defined by 

n 

L n (z) = l[(l-z- 1 z) (2.7) 

4 = 1 

is a good approximation of in K. The application of Leja polynomials to matrix 
inversion problems has been described in ||. If z\ is translation invariant, e.g. 

zf +° = z? + a, (2.8) 

which is for example true if K is a circle with center on the positive real axis and a is 
real and positive, all Leja points are translation invariant and the shifted polynomial 
is exactly the Leja polynomial for the translated region K. The application of Leja 
polynomials to construct Krylov space methods for the Wilson matrix is currently under 
investigation. 

In the case of formally orthogonal polynomials, which are usually generated in CG 
and Lanczos-type algorithms, we can also see that the shifted polynomials are exactly the 
polynomials generated by the process for the shifted matrix. To see this, we introduce 

JC n (A,v ) = span{A l v ,i = 0- • • n - 1}. (2.9) 

The Lanczos polynomials Z n (z) have the property of formal orthogonality, namely 

Z n (A)v ±IC n (A,v ) (2.10) 

or, for the non-hermitean process, 

Z n (A)v ±)C n (Alw ) (2.11) 

for some vector wq. It should be noted that usually Z n (z) is uniquely determined up 
to a scalar constant (in the case it is not uniquely determined the Lanzcos process can 
break down Q). Since 

K n (A,v )=K n (A + a,v ), (2.12) 

we must have 

ZZ(z + a)=CZ n (z), (2.13) 

since Z°(z + a) is a formally orthogonal polynomial for A as well. We therefore expect 
that the polynomials generated in CG and Lanczos-type algorithms are of a shifted 
structure. We can indeed generate the exact processes for several values of a using only 
one matrix-vector operation each iteration by calculating the shifted polynomials. 

In the following we will show how to calculate the parameters of the shifted polyno- 
mials from the original process in the case of the above mentioned recurrence relations. 
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2.1 Two-term recurrences 



This recurrence is found in MR-type methods or in hybrid methods using MR-type 
iterations. We assume here that the leading coefficient is one. The polynomial is given 
directly product of its linear factors: 

n 

R n (z) = l[(l-xnz). (2.14) 
i=i 

To calculate the shifted polynomial, we look at a linear factor 

(1 - (z + a) X ') = c(l - z X ) (2.15) 

resulting in 

x= Y+1tx (2 - 16 » 

and 

(2.17) 



1 + 

The shifted polynomial is therefore given by 



RZU) = ff _ ( l ^—z] (2.18) 

= p° n R n {z-o) (2.19) 
^ft^-. (2.20) 



i=l 



If the spectrum of the matrix lies in the right half of the complex plane we can expect 
that all inverses of the roots lie there, too. We can then easily see that c a n > 1 for a > 0, 
so that the shifted polynomial converges better than the original polynomial with a 
rate growing with a. This is not surprising since we expect the condition number of the 
matrix A + a to decrease for a > 0. 

Let us construct an algorithm using this shifted polynomial. If the single update is 
given by 



r n +i = r n - XnAr n (2.21) 

X n +1 = X n + XnTn, (2.22) 



we can generate the solutions x° by 



^ rrb (2 ' 24) 

<+l = < + XnPnfn (2.25) 
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Remarkably, if Xn is generated by the minimal residual condition, this is exactly the 
same algorithm which was found in || with a completely different approach, namely 
by a Taylor-expansion of the residual in a and resummation of the series. This is not 
completely surprising, since in the derivation in ||] approximations were made to achieve 
that no additional matrix-vector products are needed and the small recursion length 
is kept, which automatically leads to the shifted polynomial. However, the Taylor- 
expansion becomes prohibitively complex when applied to algorithms like BiCGstab, 
whereas the shifted polynomial method can easily be transferred. 



2.2 Three-term recurrences 

Let us now apply these ideas to the case of three-term recurrences, which usually ap- 
pear in algorithms derived from the Lanczos process. We look at a general three-term 
recurrence relation of the form 

v n+ i = a n Av n + (3 n v n + jnVn-i = Z n+1 (A)v (2.26) 

We want to calculate the parameters of the shifted polynomial 

Z°{z + a)=CZ n {A), (2.27) 

The equations are given by matching the parameters of 



C,° + i{a n Av n + (3 n v n + 7„u n _i) = 



(2.28) 



The parameters are not completely fixed. One possible choice is 

C = l, < = ffi = p n -aa n , 7n = 7n- (2.29) 

This was realized in [Q] to construct the QMR and TFQMR method for shifted matrices. 
The Lanczos vectors v n are in fact independent of a. If we want to use v n directly as a 
residual, we impose the condition fi n + 7 n = 1. This determines the parameters of the 
shifted polynomial: 



< 


"51 

= OL n 

1 




= (/V 



(2.30) 



Sn 
(-a (a 

act _ Sn Sn— 1 /r, om 

with 7^ + (3° = 1 and the initial conditions £-i = Co = Po = 1- F° r the case of the 
Lanczos process it is easy to proof by induction that the parameters a^, Pn an d 7n are 
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indeed the parameters generated by the Lanczos process for the matrix A + a if the 
process does not break down. The update of the solution vector is given by 



<+l = <C>n + /« + 7n<-l- (2-33) 



This is basically the BIORESU75-algorithm from ||]. There the equations ( 2.29| ) are 
used and an overall normalization factor is recursively determined. It should be noted 
that this method does not only apply to the Lanczos process, but for general parameters 
a, f3 and 7. The shifted polynomial will then not be the polynomial generated for the 
shifted process, but the shifted systems still converge if d < 1. 

2.3 Coupled two-term recurrences 

Now let us turn to the more interesting case of coupled two-term recurrence relations. 
These relations have generally a superior numerical stability compared to the equivalent 
three-term recurrence. We look at recurrences of the CG-type form 

Pn = v n + a n p n -i = Q n {A)vQ (2.34) 

V n +1 = V n + P n Ap n = Z n+1 (A)v (2.35) 

where the initial condition pq = vq has been used. The method can simply be applied 
to a more general choice of parameters. We want to calculate the parameters needed 
to generate the shifted polynomial P°(z). Unfortunately Q a n will generally not be a 
shifted polynomial. This is however not a problem, since since we can calculate 
without additional matrix-vector products from 

Pn = On + 0&£-l (2.36) 

If the vectors (A + cr)p^ are needed, we can reformulate the recursion as follows: 

Pn = v n + a n p n -i (2.37) 
q n = Av n + a n q n -i (2.38) 
Vn+l = v n + j3 n q n . (2.39) 

We have q n = Ap n in exact arithmetic. Depending on the algorithm one or both vectors 
p and q have to be stored for all values of a. Let us calculate the parameters of the 
shifted process. To do this, we derive the three-term recurrence for v: 

v n+ i = a n v n + (3 n Av n + %v n -x. (2.40) 

The parameters are given by 

a n = n , hn = 1 + 7 n = 1 - a n (2.41) 

Pn-l 
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with the initial conditions qq = and (3-\ = 1. We thus find for the shifted parameters 



P° = /?n%i (2.42) 

Q n = Q «7^-3 ( 2 - 43 ) 

C„_lPn-l 

act _ CnCn—lPn-1 ,^ 

Wl " /W£-l ~ C°n) + £-iA,-i(l - c7/3 n ) 1 • J 

At the expense of calculating (A + <r)p^ by introducing an additional vector and addi- 
tional dot products, we can also calculate the shifted parameters (3% and a a n using the 
original formulae. These formulae do not only apply to the CG process, which will be 
demonstrated below. We have thus shown that one can implement coupled two-term 
recurrences for shifted matrices. 

We can now derive shifted versions of solvers based on these recursion relations 
by simply calculating the shifted parameters and using the proportionality between the 
shifted and original polynomials. Whether we succeed in deriving the shifted algorithm 
without any additional matrix- vector products depends on whether matrix- vector prod- 
ucts of vectors which are derived from polynomials which have no shifted structure are 
needed. In some cases we can eliminate these matrix-vector products by expressions 
involving other vectors. 



3 Shifted Krylov space solvers 

In this section we develop shifted algorithm variants of the following algorithms: CG, 
CR, BiCG, BiCGstab. In addition shifted versions of the solvers QMR, TFQMR and 
MR are known, so that for most popular Krylov space methods shifted solvers are 
available. Note that since TFQMR is based on CGS, the shifted version of the latter 
algorithm is basically also available. In Table |l| we present the currently known short 
recursion methods for shifted matrices with memory requirements. To avoid a prolifera- 
tion of new names we propose to simply add -M to the name of an algorithm to indicate 
its shifted version. 

Note that we cannot easily generalize this method to the CGNE algorithm, since 
(^4 + a)(A + cr)t is not generally a shifted matrix. For staggered fermions, however, we 
are in the lucky position that the matrix has the structure 

D = A + m, A ] = -A (3.1) 

with m real, so that 

DD^ = AA ] + m 2 (3.2) 

is a shifted matrix. Since the CG and CR algorithms are optimal for staggered fermions 
|2]|j|] , we have optimal shifted algorithms available for this case. For Wilson fermions the 
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Method 


Reference 


Memory 


MR-M 


i, this 


N 


CR-M 


this 


2N 


QMR-M (3-term) 




3N 


QMR-M (2-term) 


ED 


3N 


TFQMR-M 


i 


5N 


BIORESU 


§, this 


2N 


BiCG-M 


this 


2N 


BiCGstab-M 


this 


2N + 1 



Table 1: Memory requirements and references for shifted system algorithms for unsym- 
metric or nonhermitean matrices. We list the number of additional vectors neccessary 
for N additional values of a ( which is independent of the use of the 75-symmetry) . 

interesting algorithms are MR and BiCGstab, the former due to its simple implemen- 
tation and small memory requirements and the latter due to its superior performance 
and stability; see e.g. ||]. 

3.1 CG-M, BiCG-M, BiCG7 5 -M 

We present here a version of the CG algorithm for shifted matrices. The variants BiCG 
and BiCG75 are derived analoguosly. Note that the initial guess has to be set to zero. 



CG-M algorithm : 

xl = 0,r =pl = = <Zi = Co = IX = 

for i = 0,1,2,- •• 

^ _ iTurj) 

calculate according to (2.42) - (2.44) 

x i+l = x i ~ A Pi 

(r i+1 ,r i+1 ) 

oa+l = — 7 s — 

{n,ri) 

calculate af +1 

This algorithm is a straightforward realization of the formulae (2.42) - (2.44). Note 
that we need only 2 additional vectors for each value of a even in the nonsymmetric 
BiCG case, since we can calculate the parameters from the parameters of a single system. 
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A has to be chosen in a way that £f < 1 for some i, which means that a = corresponds 
usually to the system with the slowest convergence. 

3.2 CR-M 

The CR algorithm is the truncated version of the generalized conjugate residual method 
which is a coupled two-term version of the GMRES algorithm (see [Q] and references 
therein). We formulate an algorithm which applies the shifted polynomials to the shifted 
matrices. The algorithm applied to the shifted matrix does in this case not necessar- 
ily generate the shifted polynomial. The structure is identical to the CG-M but the 
parameters are calculated differently, namely we have 

* = - ra <3 - 3) 

Note that formulae (2.42)-(2.44) still apply although we do not generate the Lanczos 
polynomial. Note also that we do not know a priori whether this algorithm converges 
for the shifted systems. This has to be checked by testing 

Cf < I- (3-5) 

If A has only eigenvalues with positive real part, we can however expect that j3 is 
generally negative and a positive. If we have Cn—l > Cn we can easily see from formula 
(2.44) that Cn > Cn+i follows. This suggests that we can expect convergence if the zero 
shift corresponds to the system with the worst condition, which was confirmed in tests 
with the Wilson fermion matrix. 

3.3 BiCGstab-M 

In the BiCGstab algorithm ||, we generate the following sequences 

r n = Z n (A)R n (A)r Q (3.6) 
w n = Z n (A)R n ^(A)r (3.7) 
s n = Qn{A)R n (A)r (3.8) 

where Z n (z) and Q n ( z ) are the BiCG-polynomials and 

n 

Rn(z) = l[(l-^z), (3.9) 

1=1 

where the parameters Xi are derived from a minimal residual condition. For the shifted 
algorithm we have 

< = (Zp°Z n (A)R n (A)r (3.10) 
w^ = CnPn~iZ n (A)R n - 1 (A)r . (3.11) 
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The update of the solution has the form 



%n+l — %n Pn s n ~i~ Xn^n+1- 



(3.12) 



The problem is that the update of s n itself requires the calculation of As n , which 
straightforwardly means we have one additional matrix-vector multiplication for each 
value of a. But we can use the relation 



to eliminate this matrix-vector product at the expense of one auxiliary vector to store 
r n . This method is safe since j3 n = only if the algorithm breaks down anyway. The 
complete algorithm is then given by (note that Sj = s? =0 ) 

BiCGstab - M algorithm : 

x% = 0,r = sl = 6,/3_i = C-i = Co = Po = = 0, 
wq so that 5q = WqVq ^ 0, 4>q = w\Asq/5q ^ 
for i = 0,1,2,- •■ 




calculate PfXi+i according to (2.42) - (2.44) 

Wi+i = n + PiAsi 

_ (Aw i+ i)^w i+ i 
(Aw i+1 ) j f Aw i+1 
calculate Xi , Pi+i according to (2.23) - (2.24) 
r i+ i = w i+ i - XiAwi+i 

x i+l =Xi~ (3° Si + x1pKi+l W i+l 

= w ] Q r i+1 



( A + °K = 7^(C+lP>n+l " C»n) 



(3.13) 




calculate afj 



le ix i+1 
r i+ i + a i+1 (si - XiAsi) 



Si+l 




The convergence of the shifted algorithms can be verified by checking that 



(3.14) 
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It is however generally advisable for all shifted algorithms to test all systems for con- 
vergence after the algorithm finishes since a loss of the condition ( |2.2| ) due to roundoff 
errors might lead to erratic convergence. 



4 Preconditioning 

There are two major limitations to shifted algorithms which diminish their usefulness 
considerably. First, we have to start with the same residual for all values of a, which 
means that cannot have <T-dependent left preconditioning. Secondly, preconditioning 
must retain the shifted structure of the matrix. While preconditioning can reduce the 
computational effort it has also the important property of numerically stabilizing the 
inversion algorithm, which is essential to achieve convergence in many cases. 

A class of preconditioners which is potentially suitable for shifted systems are poly- 
nomial preconditioners. We note here that we do not expect to considerably accelerate 
the matrix inversion algorithms by polynomial preconditioning in the case of the Wilson 
matrix since the polynomials generated by methods like BiCGstab are already nearly 
optimal (8|. We apply a preconditioning polynomial P n (z) and solve the equation 

P n (A)Ay = b, x = P n (A)y. (4.1) 

P n (z) will generally depend on the shift a, so we are looking for polynomials P n)CT {z) 
which statisfy 

P n , a (A + a)(A + a) = P nfi (A)A + n (4.2) 

and which are good preconditioners. For the linear case, the general solution is 

Pl, a (z) = 2a + a-z (4.3) 

where a is an arbitrary constant. The case a = was proposed for the Wilson fermion 
matrix in ||, leading to the preconditioned matrix 



i 2 - D en D nP 



m 2 - D nP D e 



(4.4) 



which is fortunately a reasonable preconditioner for the Wilson fermion matrix, so 
that the total work is approximately the same as for the unpreconditioned system. 
We lose (for general sources) however a factor of two compared to the usual even-odd 
preconditioning. 

We assume that generally we do not have to worry too much if P n ^z) is a good 
preconditioner for a > since usually these systems converge faster. Problems only arise 
if the desired precision is close to the precision where the residuals stagnate. Given a 
preconditioner of the form 

n 

Pn(z) = Y[( ri + z) (4.5) 
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200 400 600 800 1000 

matrix multiplications 



Figure l: Convergence history for BiCGstab for k = 0.157, 0.16, 0.162, 0.165, 0.16625, 
0.1675. The BiCGstab algorithm uses even-odd and BiCGstab-M linear polynomial 
preconditioning. Note that BiCGstab-M needs 4 matrix multiplications per iteration. 



we can calculate the preconditioner 

n 

PnAz) = T[W+z) (4.6) 

i=l 

by requiring that (|4.2| ) holds, which results in a system of n equations for the parameters 
rf. Suitable polynomials can for example be constructed from Chebychev-, Leja- or 
GMRES-polynomials. We will not examine this approach further and only apply linear 
preconditioners in our numerical tests. 



5 Numerical tests 

The algorithms were tested on 16 3 x 48 quenched SU(3) configurations at f3 = 5.7, 
fixed to Coulomb gauge. We used generally 32-bit precision for the vectors and matrix 
and 64-bit precision for the accumulation of dot products and parameter recursions. 
The tests were performed on a Cray T3D machine using the MILC code basis and 



configurations. Other tests of the QMR and MR methods can be found in pQ,p,ll 
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16 3 48, p=5 .7, Clover 
BiCGstab 




200 400 600 800 1000 

matrix multiplications 



Figure 2: Convergence history for BiCGstab for k = 0.136, 0.38, 0.140, 0.141, 0.142, 
0.14235. 

5.1 Wilson fermions 

The set of hopping parameter values was taken from an actual heavy-light calculation 
with gaussian wall sources. We compared the results against t even-odd preconditioned 
BiCGstab using the result of lower k values as initial guesses. We also applied the 
methods to Clover fermions on the same configurations. We performed tests for two 
lattices, two spin- and colorindices and sources of size 2 and 6. We found comparable 
results in all cases. 

In Figure |l| we show the convergence history of a sample run with Wilson fermions 
taken from an actual production run for heavy-light systems. The method is (averaged 
over our test runs) only about 14% faster than BiCGstab with continued guesses, which 
is due to the fact that the gap between the light mass and the heavier masses is too large. 
The desired accuracy was 10 -5 for the 3 heavier and 10 -4 for the lighter masses. It is 
easy to see, however, that this factor increases rapidly for mass values which lie closer 
together, since the continued guess method cannot keep the total number of matrix 
multiplications constant in contrast to shifted methods. The method is advantageous 
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in a specific case with n masses (we assume nonlocal sources here), if 



J2 N™ nt - 2N* CTO s ucss > 0. (5.1) 
i=i 

The first term is simply the total number of iterations using the standard algorithm, 
the last term is twice the number of iterations for the slowest system using a zero guess. 
Obviously one can construct examples where this number can become very large. Note 
that for point sources the shifted method wins another factor 2 in the Wilson fermion 
case. We also tested the MR-M method with an overrelaxation parameter u = 1.2. 
While the MR algorithm performed comparably to the BiCGstab algorithm in this 
situation, we found that the residuals of the higher mass systems stagnate at a value of 
« 10 -2 . This problem was less pronounced on smaller lattices, so that we assume that 



it is connected to a loss of condition (2.2) due to roundoff errors. It might however also 
be connected to our specific implementation. The same problem can also be seen in ||] 
in Figure 2. 

5.2 Clover fermions 

We used the tadpole-improved value of the clover constant csw = 1-5678 and values 
of k so that the inversion takes approximately as long as in the Wilson case. For the 
BiCGstab algorithm we used the preconditioned matrix 





1 



(5.2) 



For BiCGstab-M we used the linear preconditioner (O) with a = 0. The precondi- 
tioned matrix does not separate nicely like in the Wilson case, which makes however 
no difference in the computational effort for general sources. It does however serve its 
main purpose, namely to stabilize the algorithm sufficiently so that it converges in our 
test cases. We find that the implementation of the preconditioner is important in the 



sense that a violation of condition (4.2) due to roundoff errors can lead to a stagnation 
of the shifted residuals. The number of iterations needed with zero initial guess is ap- 
proximately the same for the BiCGstab and BiCGstab-M for the smallest mass which 
means that the linear preconditioner reduces the condition of the matrix as well as the 
preconditioner ( |5.2[ ) . The further conclusions are therefore similar to the Wilson fermion 
case. In Figure |2| we show a convergence history for a system with clover fermions. Note 
that we saw examples of a loss of precision in the shifted residuals which lead to early 
stagnation, so that it is advisable to check the residuals of the shifted systems for con- 
vergence. Here the mass values lie effectively closer together and a bigger improvement 
can be seen. 
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6 Conclusions 



We presented a simple point of view to understand the structure of Krylov space al- 
gorithms for shifted systems, allowing us to construct shifted versions of most short 
recurrence Krylov space algorithms. We developed the shifted CG-M and CR-M algo- 
rithm which can be applied to staggered fermion calculations. Since efficient precondi- 
tioned for the staggered fermion matrix are not known, a very large improvement by 
these algorithms can be expected. We also presented the BiCGstab-M method, which, 
among the shifted algorithms, is the method of choice for quark propagator calculations 
using Wilson (and presumably also Clover) fermions if enough memory is available. It 
becomes available simply by extending existing BiCGstab implementations. We inves- 
tigated the efficiency of this method in realistic applications and found that, for sources 
other than point sources, the improvement depends heavily on the values of the quark 
masses. The improvement is generally higher for masses which lie closer together. The 
numerical stability of convergence of the shifted systems is found to be very good so that 
this method is feasible in 32-bit arithmetic. The application of this method to Clover 
fermions is possible. Using simple linear polynomial preconditioning we can stabilize 
the solver sufficiently even for relatively small quark masses. We proposed a way to 
apply higher order polynomial preconditioners to shifted matrix solvers which may be 
neccessary in the case of very small quark masses. Roundoff errors might however in 
some cases affect the convergence of the shifted systems so that the final residuals have 
to be checked. 
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